A very low bit rate speech coder using HMM-based speech recognition/synthesis techniques

Authors

Keiichi Tokuda

Takashi Masuko

Jun Hiroi

Takao Kobayashi

Tadashi Kitamura

Abstract

This paper presents a very low bit rate speech coder based on hidden Markov models (HMMs). The encoder carries out phoneme recognition and transmits phoneme indexes, state durations, and pitch information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indexes, and a sequence of mel-cepstral coefficient vectors is generated from the concatenated HMM using an ML-based speech parameter generation technique. Finally, synthetic speech is obtained by exciting the MLSA (Mel Log Spectrum Approximation) filter, whose coefficients are given by the mel-cepstral coefficients, according to the pitch information. A subjective listening test shows that the performance of the proposed coder at about 150 bit/s (for test data including 26% silence regions) is comparable to that of a VQ-based vocoder at 400 bit/s (= 8 bit/frame × 50 frame/s), with pitch left unquantized in both coders.
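The bit rate comparison in the abstract can be illustrated with a short sketch. The 8 bit/frame and 50 frame/s figures come from the abstract; everything else here is an assumption made for illustration only: the `PhonemeSegment` layout, the function names, and the bit widths (6 bits per phoneme index, 3 bits per state duration) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PhonemeSegment:
    """One recognized phoneme as an encoder might transmit it (hypothetical layout)."""
    phoneme_index: int          # index into the phoneme HMM inventory
    state_durations: List[int]  # number of frames spent in each HMM state


def vq_bit_rate(bits_per_frame: int, frames_per_second: int) -> int:
    """Bit rate of a frame-based VQ vocoder, pitch excluded."""
    return bits_per_frame * frames_per_second


def segment_bit_rate(segments: List[PhonemeSegment],
                     bits_per_index: int,
                     bits_per_duration: int,
                     frames_per_second: int) -> float:
    """Average bit rate of a phoneme-level stream, pitch excluded.

    Each segment costs one index plus one duration code per HMM state;
    the total is normalized by the time the segments span.
    """
    total_bits = 0
    total_frames = 0
    for seg in segments:
        total_bits += bits_per_index + bits_per_duration * len(seg.state_durations)
        total_frames += sum(seg.state_durations)
    if total_frames == 0:
        raise ValueError("segments must cover at least one frame")
    return total_bits * frames_per_second / total_frames


# Baseline from the abstract: 8 bit/frame at 50 frame/s.
baseline = vq_bit_rate(8, 50)  # 400 bit/s

# Illustrative input only: two 3-state phonemes spanning 15 frames in total.
example = [PhonemeSegment(12, [2, 3, 2]), PhonemeSegment(7, [1, 4, 3])]
phonetic = segment_bit_rate(example, bits_per_index=6,
                            bits_per_duration=3, frames_per_second=50)  # 100.0 bit/s
```

The point of the sketch is only that a frame-based coder pays a fixed cost every frame, whereas a phoneme-level stream pays once per phoneme, so its rate falls as phonemes get longer. The 100 bit/s result follows from the made-up bit widths above and should not be read as a reproduction of the paper's 150 bit/s figure.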


Similar resources

Improving the performance of HMM-based very low bit rate speech coding

In this paper, we define an F0 quantization scheme for a very low bit rate speech coder based on HMM (Hidden Markov Model). In the coding system, the encoder carries out phoneme recognition, and transmits phoneme indices, state durations and F0 information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indices, and a sequence of mel-cepstral coefficient v...


Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture

Current HMM-based low bit rate speech coding systems work with phonetic vocoders. Pitch contour coding (on frame or phoneme level) is usually fairly orthogonal to other speech coding parameters. We make an assumption in our work that the speech signal contains supra-segmental cues. Hence, we present encoding of the pitch on the syllable level, used in the framework of a recognition/synthesis sp...


A very low bit rate speech coder using HMM with speaker adaptation

This paper describes a speaker adaptation technique for a phonetic vocoder based on HMM. In the vocoder, the encoder performs phoneme recognition and transmits phoneme indexes and state durations to the decoder, and the decoder synthesizes speech using HMM-based speech synthesis technique. One of the main problems of this vocoder is that the voice characteristics of synthetic speech depend on H...


Dynamic Unit Selection for Very Low Bit Rate Coding at 500 bits/sec

This paper presents a new unit selection process for Very Low Bit Rate speech encoding around 500 bits/sec. The encoding is based on speech recognition and speech synthesis technologies. The aim of this approach is to use at best the speech corpus of the speaker. The proposed solution uses HMM modelling for the recognition of elementary speech units. The HMM are first trained in an unsupervised...


A very low bit rate speech coder based on a recognition/synthesis paradigm

Recent studies have shown that a concatenative speech synthesis system with a large database produces more natural sounding speech. We apply this paradigm to the design of improved very low bit rate speech coders (sub 1000 b/s). The proposed speech coder consists of unit selection, prosody coding, prosody modification and waveform concatenation. The encoder selects the best unit sequence from a...




Publication date: 1998
